Abstract: Several studies have shown that mobility patterns in
mobile networks are driven by human behavior. This effect has
been particularly observed in intermittently connected networks
such as DTNs (Delay Tolerant Networks). Given that common social
intentions generate similar human behavior, it is relevant to
exploit this knowledge in the design of network protocols, e.g.
to identify the degree of closeness between two nodes. In this
paper, we propose a temporal link prediction technique for DTNs
which quantifies the behavior similarity between each pair of
nodes and uses it to predict future links. We show that the
tensor-based technique is effective for temporal link prediction
in intermittently connected networks. The method remains valid
when the prediction is made in a distributed way (i.e. with
local information), and its performance is compared to that of
well-known link prediction metrics proposed in the literature.

Index Terms — Link prediction, wireless networks, intermittent 
connections, tensor, Katz measure, behavior similarity, DTN 

I. Introduction 

In recent years extensive research has addressed challenges 
and problems raised in mobile, sparse and intermittently con- 
nected networks (i.e. DTN). In this case, forwarding packets 
greatly depends on the occurrence of contacts. Since the 
existence of links is crucial to deliver data from a source 
to a destination, the contacts and their properties emerge as 
a key issue in designing efficient communication protocols 
[1]. Obviously, the occurrence of links is determined by the
behavior of the nodes in the network [2]. It has been widely 
shown in 01, El that human mobility is directed by social 
intentions and reflects spatio-temporal regularity. A node can 
follow other nodes to a specific location (spatial level) and may 
bring out a behavior which may be regulated by a schedule 
(temporal level). The social intentions that govern the behavior 
of mobile users have also been observed through statistical 
analyses in O, Q by showing that the distribution of
inter-contact times follows a truncated power law.

With the intention of improving the performance of intermit- 
tently connected wireless network protocols, it is paramount 
to track and understand the behavior of the nodes. We aim 
to propose an approach that analyzes the network statistics, 
quantifies the social relationship between each pair of nodes 
and exploits this measure as a score which indicates if a link 
would occur in the immediate future. 
In this paper, we adapt a tensor-based link prediction
algorithm successfully designed for data mining [6], [7]. Our
proposal records the network structure for T time periods
and predicts link occurrences for the (T+1)-th period.
This link prediction technique consists of two steps.
First, time-dependent network snapshots are tracked in adjacency
matrices which form a tensor. Second, the Katz measure [8],
inspired by sociometry, is applied. To the best of our
knowledge, this work is the first to perform the prediction
technique in a distributed way. Assessing its efficiency
can benefit the improvement or the design of
communication protocols in mobile, sparse and intermittently
connected networks.

The paper is organized as follows. Section II presents the
related work, which highlights the growing interest in social
analysis and justifies the use of tensors and of the
Katz measure to perform predictions. In Section III, we describe
the two main steps that characterize our proposal. Section IV
details the simulation scenarios used to evaluate the tensor-based
prediction approach, analyzes the results obtained and assesses
its efficiency. Finally, we conclude the paper in Section V.

II. Related Work 

Social Network Analysis (SNA) 0, [10] and ad-hoc networking
have provided new perspectives for the design of network
protocols [11], [12], [13]. These protocols aim to exploit
the social aspects and relationship features between the nodes.
Studies conducted in the field of SNA have mainly focused
on two kinds of concepts: the most well-known centrality
metrics suggested in [9], [14], [15], [16] and the community
detection mechanisms proposed in fTTlL (TOIL (TOIL [19]. From
this perspective, several works have tried to develop synthetic
models that aim to reproduce realistic moving patterns 0,
[20]. Nonetheless, the study done in Q has underlined the
fact that synthetic models cannot faithfully reproduce human
behavior because these synthetic models are only
location-driven and they do not track social intentions explicitly.

In their survey, Katsaros et al. [10] have underlined the
limits of these protocols when the network topology is
time-varying. The main drawback comes down to their inability to
model topology changes, as they are based on graph theory
tools. To overcome this limit, tensor-based approaches have
been used in some works to build statistics on the behavior
of nodes in wireless networks over time, as in [21]. Thakur
et al. [4] have also developed a model using a collapsed
tensor that tracks users' location preferences (characterized by
probabilities) with a given time granularity (week days,
for example) in order to closely follow the emergence of
"behavior-aware" delay tolerant networks.

As previously mentioned, tracking the social ties between 
network entities enables us to understand how the network is 
structured. Such tracking has led to the design of techniques 
for link prediction. Link prediction in social networks has 
been addressed in data mining applications as in [6], [7].
Concerning link prediction in community-based communication
networks, [22] has highlighted salient measures that allow
link occurrence between network users to be predicted. These
metrics determine if a link occurrence is likely by quantifying
the degree of proximity of two nodes (Katz measure [8], the
number of common neighbors, Adamic-Adar measure [23],
Jaccard's coefficient [24], [25], ...) or by computing the
similarity of their mobility patterns (spatial cosine similarity,
co-location rate, ...).

In this paper, we propose a link prediction technique that 
tracks the temporal network topology evolution in a tensor and 
computes a metric in order to characterize the social-based 
behavior similarity of each pair of nodes. Some approaches 
have addressed the same problem in data-mining in order to 
perform link prediction. Acar et al. [6] and Dunlavy et al. [7]
have provided detailed methods based on matrix and tensor
factorizations for link prediction in social networks such as the
DBLP data set [26]. These methods have been successfully
applied to predict a collaboration between two authors by
recording the structure of relationships over a tracking period.
Moreover, they have highlighted the use of the Katz measure
[8], which can be seen as a behavior similarity metric, by
assigning a link prediction score to each pair of nodes. The
efficiency of the Katz measure in link prediction has also been
demonstrated in [6], [7], [22], [28].

III. Description of the Tensor Based Prediction 
Method 

It has been highlighted that a human mobility pattern 
shows a high degree of temporal and spatial regularity, and 
each individual is characterized by a time-dependent mobility 
pattern and a trend to return to preferred locations 13, 0, 
[4]. In this paper, we propose an approach that aims to exploit 
similar behavior of nodes in order to predict link occurrence 
referring to the social closeness. 

To quantify the social closeness between each pair of 
nodes in the network, we use the Katz measure inspired 
by sociometry. This measure aims at quantifying the social 
distance between people inside a social network. We also need 
to use a structure that records link occurrence between each 
pair of nodes over a certain period of time in order to perform 
the similarity measure computation. The records represent the 
network behavior statistics in time and space. To this end, 
a third-order tensor is considered. A tensor Z consists of
a set of slices and each slice corresponds to an adjacency
matrix of the network tracked over a given period of time p.
After the tracking phase, we reduce the tensor into a matrix 
(or collapsed tensor) which expresses the weight of each link 
according to its lifetime and its recentness. A high weight 
value in this matrix denotes a link whose corresponding nodes 
share a high degree of closeness. We apply the Katz measure 
to the collapsed tensor to compute a matrix of scores S that 
not only considers direct links but also indirect links (multi- 
hop connections). The matrix of scores expresses the degree 
of similarity of each pair of nodes according to the spatial 
and the temporal levels. The higher the score is, the better the 
similarity pattern gets. Therefore, two nodes that have a high 
similarity score are more likely to have a common link in the 
future. 

A. Notation 

Scalars are denoted by lowercase letters, e.g., $a$. Vectors
are denoted by boldface lowercase letters, e.g., $\mathbf{a}$.
Matrices are denoted by boldface capital letters, e.g.,
$\mathbf{A}$. The $r$-th column of a matrix $\mathbf{A}$ is
denoted by $\mathbf{a}_r$. Higher-order tensors are denoted
by bold Euler script letters, e.g., $\mathcal{T}$. The $n$-th
frontal slice of a tensor $\mathcal{T}$ is denoted
$\mathbf{T}_n$. The $i$-th entry of a vector $\mathbf{a}$ is
denoted by $\mathbf{a}(i)$, element $(i, j)$ of a matrix
$\mathbf{A}$ is denoted by $\mathbf{A}(i, j)$, and element
$(i, j, k)$ of a third-order tensor $\mathcal{T}$ is denoted by
$\mathbf{T}_i(j, k)$.

B. Matrix of Scores Computation 

The computation of the similarity scores is modeled through 
two distinct steps. First, we store the inter-contact between 
nodes in a tensor Z and reduce it to a matrix X called the 
collapsed tensor. In a second step, we compute the matrix of 
similarity scores S relying on the matrix X (cf. Fig. 1).

We consider that the data is collected into the tensor Z. The
slice $Z_p(i, j)$ describes the status of a link between a node i
and a node j during a time period $[(p-1) \cdot t, \; p \cdot t[$
(p > 0), where $Z_p(i, j)$ is 1 if the link exists during this
period and 0 otherwise. The tensor is formed by a succession of
adjacency matrices $Z_1$ to $Z_T$, where the subscript designates
the observed period. To collapse the data into one matrix as done
in [6], [7], we choose to compute the collapsed weighted tensor
(which is the most efficient way to collapse the data as shown
in [6] and [7]). The link structure is considered over time:
the more recent the adjacency matrix is, the more weight
it receives. The collapsed weighted tensor is computed
as follows:

$$X(i, j) = \sum_{p=1}^{T} (1 - \theta)^{T-p} \, Z_p(i, j) \qquad (1)$$

where the matrix X is the collapsed weighted tensor of Z,
and $\theta$ is a parameter between 0 and 1 used to adjust the
weight of recentness.
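
The collapsing step defined above can be illustrated with a minimal Python sketch using NumPy. The function name `collapse_weighted`, the storage convention (slice $Z_p$ held at index p-1 of the first axis) and the toy data are assumptions made for this illustration only; they are not taken from the original implementation.

```python
import numpy as np


def collapse_weighted(Z: np.ndarray, theta: float) -> np.ndarray:
    """Collapse a T x N x N stack of adjacency matrices into one N x N matrix.

    Slice p (1-based) receives the weight (1 - theta) ** (T - p), so the most
    recent slice has weight 1 and older slices decay geometrically.
    """
    T = Z.shape[0]
    # Exponents T-1, T-2, ..., 0 for slices p = 1, 2, ..., T.
    weights = (1.0 - theta) ** np.arange(T - 1, -1, -1)
    # Weighted sum over the time axis.
    return np.tensordot(weights, Z, axes=1)


# Toy example (assumed data): 2 nodes, 3 periods,
# the link exists in periods 1 and 3 only.
Z = np.array([
    [[0, 1], [1, 0]],
    [[0, 0], [0, 0]],
    [[0, 1], [1, 0]],
], dtype=float)
X = collapse_weighted(Z, theta=0.2)
print(X[0, 1])  # equals 0.8**2 + 0.8**0
```

With theta = 0.2, a contact seen only in the oldest of three slices contributes 0.64, whereas a contact in the latest slice contributes 1, which is how recentness is favored.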

As the Katz measure quantifies the network proximity between
two nodes, and given that there are "social relationships"
between nodes in networks with intermittent connections, it
is worthwhile to exploit this measure and to apply it to the
collected data. The Katz score of a link between a
node i and a node j is given by [8] as:

$$S(i, j) = \sum_{\ell=1}^{+\infty} \beta^{\ell} \, P_{\ell}(i, j) \qquad (2)$$

where $\beta$ is a user-defined parameter strictly greater than
zero, $\beta^{\ell}$ is the weight of a path of length $\ell$
hops and $P_{\ell}(i, j)$ represents the number of paths of
length $\ell$ that join the node i to the node j.

Fig. 1: Example of the matrix S computation

It is clear that the longer the path is, the lower its weight
gets. Katz scores can also be formulated by means of the
collapsed weighted tensor detailed previously.
We quantify the proximity between nodes relying on the paths
that separate a pair of nodes and the weights of the links that
form these paths. Then, the score matrix S can be rewritten
as:

$$S = \sum_{\ell=1}^{+\infty} \beta^{\ell} \, X^{\ell} = (I - \beta X)^{-1} - I \qquad (3)$$

where I is the identity matrix and X is the collapsed weighted
tensor obtained.
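
The closed form of the Katz scores, S = (I - βX)^{-1} - I, can be sketched in a few lines of Python with NumPy. The function name `katz_scores`, the explicit convergence check and the three-node example are assumptions added for illustration; the check relies on the standard fact that the series only converges when β times the spectral radius of X is below 1.

```python
import numpy as np


def katz_scores(X: np.ndarray, beta: float) -> np.ndarray:
    """Return S = (I - beta * X)^{-1} - I for a square weight matrix X."""
    n = X.shape[0]
    # The series sum_{l >= 1} beta^l X^l converges only if beta * rho(X) < 1,
    # where rho(X) is the spectral radius of X.
    rho = np.max(np.abs(np.linalg.eigvals(X)))
    if beta * rho >= 1.0:
        raise ValueError("beta is too large: the Katz series does not converge")
    identity = np.eye(n)
    return np.linalg.inv(identity - beta * X) - identity


# Assumed toy input: path graph 0 - 1 - 2 (no direct link between 0 and 2).
X = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
S = katz_scores(X, beta=0.1)
print(S[0, 1], S[0, 2])  # direct neighbors score higher than two-hop pairs
```

Nodes 0 and 2 obtain a nonzero score although they never share a direct link, which reflects the contribution of multi-hop paths to the similarity.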

Fig. 1 depicts an example which details the two major steps
described above. We consider a network of 4 nodes with
a dynamic topology over 4 time periods, and we highlight how
similarity scores are obtained. The parameters $\theta$ and
$\beta$ are respectively set to 0.2 and 0.001 for the example
and later for the simulations. We selected the values of
these two parameters through several simulations and
found that this setting ensures the convergence
of the Katz measure, as explained in [29]. In this example, we
assume that all nodes have full knowledge of the network
structure.

IV. Performance Evaluation and Simulation 
Results 

To evaluate how efficient the tensor-based link prediction
is in intermittently connected wireless networks, we consider
two real traces. In the following, we first present the traces
used for the link prediction evaluation. Then, we present
the corresponding results, analyze the effectiveness of the
prediction method and compare its performance to that of
well-known link prediction metrics proposed in the literature.

A. Simulation Traces 

We consider two real traces to evaluate the link predic- 
tion approach. We exploit them to construct the tensor by 
generating adjacency matrices for several tracking periods. 
For each case, we track the required statistics about network 
topology within T periods. We also consider the adjacency
matrix corresponding to the period T+1 as a benchmark to
evaluate the Katz score matrix. We detail the traces used in
the following.

• First Trace: Dartmouth Campus trace: we choose the
trace of 01/05/06 [30] and construct the tensor slices
relying on SYSLOG traces between 8 a.m. and 3 p.m. 
(7 hours). The number of nodes is 1018 and the number 
of locations (i.e. access points) is 128. 

• Second Trace: MIT Campus trace: we focus on the 
trace of 07/23/02 [31] and consider also the events 
between 8 a.m. and 3 p.m. to build up the tensor. The 
number of nodes is 646 and the number of locations (i.e. 
access points) is 174. 

For each scenario, we generate adjacency matrices corresponding
to different tracking periods t: 5, 10, 30 and 60 minutes.
To record the network statistics over 7 hours, the tensor has
a number of slices T equal to 84, 42, 14 and 7 respectively
(for instance, when t = 5 minutes, 84 periods are needed to
cover 7 hours). We take into account both
centralized and distributed cases for the computation of scores.
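
As a rough illustration of how the slices could be generated, the sketch below builds the tensor from a list of contact intervals. The record format `(i, j, start, end)`, the function name `build_tensor` and the sample contacts are hypothetical: the traces used here are SYSLOG association records, and the exact rule that turns them into links is not detailed in this section.

```python
import math

import numpy as np


def build_tensor(contacts, n_nodes, t, horizon):
    """Build a T x N x N tensor of adjacency slices from contact intervals.

    contacts : iterable of (i, j, start, end), times in minutes since the
               start of tracking, each contact covering [start, end[.
    t        : slice period in minutes.
    horizon  : total tracked duration in minutes.
    """
    T = horizon // t  # e.g. 420 // 5 = 84 slices for 7 hours
    Z = np.zeros((T, n_nodes, n_nodes))
    for i, j, start, end in contacts:
        if end <= start:
            continue  # ignore empty contacts
        first = max(int(start // t), 0)
        last = min(math.ceil(end / t) - 1, T - 1)
        for q in range(first, last + 1):
            # Slice index q corresponds to period p = q + 1.
            Z[q, i, j] = Z[q, j, i] = 1.0
    return Z


# Hypothetical contacts: nodes 0 and 1 meet during minutes [3, 12[,
# nodes 1 and 2 meet during minutes [20, 21[.
Z = build_tensor([(0, 1, 3, 12), (1, 2, 20, 21)], n_nodes=3, t=5, horizon=420)
print(Z.shape[0])   # number of slices T
print(Z[:5, 0, 1])  # status of link 0-1 over the first five slices
```

A link is marked in every slice that the contact overlaps, so the same one-minute contact sets a whole slice to 1 regardless of whether the slice lasts 5 or 60 minutes.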

• The Centralized Computation: the centralized way 
assumes that there is a central entity which has full 
knowledge of the network structure at each period and 
applies Katz measure to the global adjacency matrices. 

• The Distributed Computation: each node has a limited 
knowledge of the network structure. We assume that 
a node is aware of its two-hop neighborhood. Hence, 
computation of Katz measures is performed on a local- 
information-basis. 
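
One possible reading of the distributed computation is sketched below in Python: a node restricts the collapsed matrix to its two-hop neighborhood and applies the Katz closed form to that local view. The helper names `two_hop_nodes` and `local_katz_row`, and the assumption that a node also knows the links among its two-hop neighbors, are illustrative choices and are not specified by the text.

```python
import numpy as np


def two_hop_nodes(X, u):
    """Nodes reachable from u in at most two hops over nonzero entries of X."""
    direct = {int(v) for v in np.flatnonzero(X[u])}
    second = {int(w) for v in direct for w in np.flatnonzero(X[v])}
    return sorted({u} | direct | second)


def local_katz_row(X, u, beta):
    """Katz scores of node u computed only from its two-hop neighborhood."""
    nodes = two_hop_nodes(X, u)
    sub = X[np.ix_(nodes, nodes)]  # local view of the collapsed tensor
    identity = np.eye(len(nodes))
    S_local = np.linalg.inv(identity - beta * sub) - identity
    row = np.zeros(X.shape[0])
    row[nodes] = S_local[nodes.index(u)]  # unknown nodes keep a score of 0
    return row


# Assumed toy input: path graph 0 - 1 - 2 - 3 - 4.
X = np.zeros((5, 5))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    X[a, b] = X[b, a] = 1.0
row = local_katz_row(X, u=0, beta=0.1)
print(two_hop_nodes(X, 0))  # [0, 1, 2]
print(row)
```

In this toy case, node 0 ignores nodes 3 and 4, yet its scores for nodes 1 and 2 differ from the centralized ones only through paths of five hops or more, which carry a weight of at most β^5.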

B. Performance Analysis 

As described in the previous section, we apply the link
prediction method to the traces, considering different
tensor slice periods in both centralized and distributed cases.
In order to assess the efficiency of this method, we consider 
several link prediction scenarios (according to the trace, the 
tensor slice period and the scores computation way) and we 
use different evaluation techniques. We detail in the following 
the results obtained for the evaluation and analyze the link 
prediction efficiency. Then, we compare the performance of 
the proposed framework to those of major link prediction 
metrics in order to justify the use of the Katz measure. 

1) Evaluation of the link prediction technique: To evaluate
the efficiency of our proposal, we plot the ROC curves
(Receiver Operating Characteristic curves) [32]. In Fig. 2, we
depict the ROC curves obtained after performing prediction
on the Dartmouth Campus trace for different tensor slice
times. Adapted metrics are also used in order to weigh the
performance of the proposed link prediction technique. To
this end, we compute the Area Under the ROC Curve metric
(AUC metric) [32], which can be considered as a good
performance indicator in our case. The AUC metric of each
scenario is determined from the corresponding ROC curve.
Moreover, we consider the top scores ratio metric at T+1.
To determine this metric, we count the links correctly
identified through the link prediction technique. We list,
for each considered time period, the number of existing links at
period T+1, which we call L. Then, we extract the L pairs of
nodes having the highest scores and determine the number of
links present in both sets. The evaluation metrics are computed
for all traces with different tensor slice periods in both
distributed and centralized scenarios. The results
corresponding to the prediction of all links are listed in
Table I (Dartmouth Campus trace) and Table II (MIT Campus trace).

TABLE I: Evaluation metrics for the prediction of all links
applied on Dartmouth Campus trace

Prediction case                     AUC      Top Scores Ratio at T+1
Distributed case, t = 5 mins        0.9932   93.70%
Centralized case, t = 5 mins        0.9905   93.61%
Distributed case, t = 10 mins       0.9915   90.26%
Centralized case, t = 10 mins       0.9883   90.19%
Distributed case, t = 30 mins       0.9813   82.31%
Centralized case, t = 30 mins       0.9764   82.56%
Distributed case, t = 60 mins       0.9687   76.10%
Centralized case, t = 60 mins       0.9636   75.94%

TABLE II: Evaluation metrics for the prediction of all links
applied on MIT Campus trace

Prediction case                     AUC      Top Scores Ratio at T+1
Distributed case, t = 5 mins        0.9907   91.48%
Centralized case, t = 5 mins        0.9929   91.48%
Distributed case, t = 10 mins       0.9797   85.18%
Centralized case, t = 10 mins       0.9809   85.14%
Distributed case, t = 30 mins       0.9589   73.31%
Centralized case, t = 30 mins       0.9578   73.76%
Distributed case, t = 60 mins       0.9328   64.54%
Centralized case, t = 60 mins       0.9325   64.54%

TABLE III: Table of confusion of a binary prediction technique

Prediction outcome    Actual value: Positive    Actual value: Negative
Positive              True Positive (TP)        False Positive (FP)
Negative              False Negative (FN)       True Negative (TN)
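
The two evaluation metrics described above can be sketched as follows in Python with NumPy. The function names, the restriction to pairs i < j (links are assumed undirected) and the toy scores are assumptions for illustration. The AUC is computed through its standard rank interpretation, i.e. the probability that a true link receives a higher score than a non-link, rather than by integrating a plotted ROC curve; the sketch also assumes that both links and non-links exist at T+1.

```python
import numpy as np


def pair_vectors(S, A_next):
    """Flatten the upper triangles (pairs i < j) of the score matrix S and of
    the adjacency matrix A_next observed at period T+1."""
    iu = np.triu_indices_from(S, k=1)
    return S[iu], A_next[iu].astype(bool)


def top_scores_ratio(S, A_next):
    """Share of the L existing links found among the L best-scored pairs."""
    scores, actual = pair_vectors(S, A_next)
    L = int(actual.sum())  # number of links at T+1
    if L == 0:
        return float("nan")
    top = np.argsort(-scores, kind="stable")[:L]  # L highest-scored pairs
    return float(actual[top].sum()) / L


def auc(S, A_next):
    """Probability that a true link outscores a non-link (ties count 1/2)."""
    scores, actual = pair_vectors(S, A_next)
    pos, neg = scores[actual], scores[~actual]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float(wins + 0.5 * ties) / (pos.size * neg.size)


# Assumed toy data: (i, j, score, link present at T+1).
S = np.zeros((4, 4))
A = np.zeros((4, 4))
pairs = [(0, 1, 0.9, 1), (0, 2, 0.8, 0), (0, 3, 0.1, 0),
         (1, 2, 0.7, 1), (1, 3, 0.2, 0), (2, 3, 0.3, 1)]
for i, j, s, a in pairs:
    S[i, j] = S[j, i] = s
    A[i, j] = A[j, i] = a
print(top_scores_ratio(S, A))  # 2 of the 3 top-scored pairs are links
print(auc(S, A))               # 7 of the 9 (link, non-link) pairs are well ranked
```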

We first note that, in Fig. 2 and for all scenarios, the
prediction of all links is quite efficient compared to a
random guess (the bends of the curves are at the upper left
corner). We obtain similar ROC curves with the MIT Campus trace
(we do not present them due to space limitations). Moreover,
based on the high values of the AUC metric (above 0.9) and of
the top scores ratio obtained at T+1, we remark that the
prediction method is efficient in predicting future links (for
the period T+1). We also note that prediction is better when the
tensor slice periods are shorter. This observation can be
explained by two reasons. On the one hand, with a short tensor
slice period, a short and occasional contact between
two nodes is less likely to be recorded in a given slice. On the
other hand, recording seven hours of statistics requires 84
adjacency matrices with 5-minute periods instead of 7 matrices
with 60-minute periods. Thus,
tracking a short contact between two nodes has less influence
when the tensor slices are more numerous.

Regarding the comparison between the two ways of computing
the Katz scores, we observe that the centralized and
distributed computations of the matrix of scores achieve similar
performance. In fact, the similarity is higher when the paths
considered between a pair of nodes are short. Paths
that have more than two hops therefore contribute weaker scores
and are less weighted than shorter ones. The distributed
case assumes that each node knows its neighbors up to
two hops away. That is why the distributed computation of scores
performs so similarly to the centralized one.

2) Prediction Performance Comparison between the
Tensor-Based Technique and Well-Known Link Prediction
Metrics: In this subsection, we compare our
proposal to other similar approaches (we use the distributed
design of our framework to compute the Katz scores). To
provide a comprehensive comparison, we also
evaluate the prediction efficiency of well-known prediction
metrics presented in the literature. On the one hand, we
consider behavior-based link prediction metrics, namely the
similarity metric of Thakur et al. [4] and two metrics
expressing mobile homophily proposed by Wang et al. in
[22]: the spatial cosine similarity and the co-location rate.
On the other hand, we take two link prediction metrics based,
like the Katz measure, on measuring the degree of proximity:
the Adamic-Adar measure [23] and Jaccard's
coefficient [24], [25].

To assess the efficiency of each link prediction metric, we 
consider these evaluation measures: 

• Top Scores Ratio in the period T+1 (TSR): to determine
this metric, we compute the percentage of occurring
links identified through the link prediction technique. We
list the number of existing links (at period T+1 or during
the periods coming after the period T), which we call L.
Then, we extract the L pairs of nodes having the highest
scores and determine the percentage of links present in
both sets.

• Accuracy (ACC): this measure is defined in [32] as the
ratio of correct predictions (true positive and true negative
predictions) over all predictions (true positive, true
negative, false positive and false negative predictions). In
other words, it is computed by the ratio
$\frac{TP + TN}{TP + FP + TN + FN}$ (see
Table III). We identify for each scenario the maximum
value of the accuracy, which indicates the degree of
precision that each prediction metric can reach.

• Precision or Positive Predictive Value (PPV): it
represents the proportion of links with positive prediction
(occurring in the future) which are correctly identified
[32]. Based on Table III, the precision is equal to
$\frac{TP}{TP + FP}$.
This value is determined according to the deduced
accuracy value.

• Recall or True Positive Rate (TPR): it quantifies the
ratio of correctly identified links over the occurring links
in the future [32]. Referring to Table III, the recall is
defined by the expression $\frac{TP}{TP + FN}$. This value is
also computed according to the retained accuracy value.

• F-measure or balanced F1 score: the F-measure [33]
is the harmonic mean of precision and recall. The
F-measure is expressed by
$\frac{2 \cdot precision \cdot recall}{precision + recall}$.
The higher the F-measure is, the better the tradeoff between
precision and recall gets and the more efficient the
prediction metric is.

Fig. 2: ROC Curves for different prediction cases applied on Dartmouth Campus trace: (a) 5 minutes, (b) 10 minutes, (c) 30 minutes and (d) 60 minutes tensor slice period

TABLE IV: Evaluation metrics for the prediction applied on
the Dartmouth Campus trace

Period length     Prediction Score       TSR in T+1   Accuracy   Precision (PPV)   Recall (TPR)   F-measure
5 mins (T=96)     Thakur's Metric        41.39%       99.11%     36.40%            11.57%         0.1756
                  Spatial Cosine Sim.    66.01%       99.45%     67.44%            63.75%         0.6554
                  Co-Location Rate       68.96%       99.50%     73.98%            60.71%         0.6669
                  Adamic-Adar Meas.      83.81%       99.74%     82.58%            85.57%         0.8405
                  Jaccard's Coeff.       82.54%       99.72%     81.27%            85.08%         0.8313
                  Katz Measure           90.88%       99.86%     90.59%            91.87%         0.9123
10 mins (T=48)    Thakur's Metric        43.29%       99.10%     37.31%            11.15%         0.1717
                  Spatial Cosine Sim.    66.71%       99.45%     68.52%            62.99%         0.6564
                  Co-Location Rate       68.78%       99.49%     71.50%            65.63%         0.6844
                  Adamic-Adar Meas.      81.01%       99.68%     78.87%            84.00%         0.8135
                  Jaccard's Coeff.       79.75%       99.66%     78.04%            82.83%         0.8036
                  Katz Measure           86.39%       99.78%     89.75%            82.94%         0.8621
30 mins (T=16)    Thakur's Metric        45.18%       99.06%     39.08%            10.83%         0.1696
                  Spatial Cosine Sim.    67.35%       99.42%     67.60%            67.00%         0.6730
                  Co-Location Rate       67.78%       99.45%     72.47%            61.33%         0.6644
                  Adamic-Adar Meas.      71.82%       99.50%     71.25%            73.86%         0.7253
                  Jaccard's Coeff.       71.34%       99.50%     72.63%            69.65%         0.7111
                  Katz Measure           79.83%       99.64%     80.09%            79.48%         0.7978
60 mins (T=8)     Thakur's Metric        46.39%       99.04%     41.39%            10.61%         0.1689
                  Spatial Cosine Sim.    67.55%       99.40%     68.51%            65.70%         0.6708
                  Co-Location Rate       68.11%       99.42%     72.21%            60.31%         0.6573
                  Adamic-Adar Meas.      65.98%       99.38%     69.73%            57.42%         0.6298
                  Jaccard's Coeff.       67.00%       99.47%     68.31%            64.53%         0.6637
                  Katz Measure           74.09%       99.53%     75.33%            72.84%         0.7406

TABLE V: Evaluation metrics for the prediction applied on
the MIT Campus trace

Period length     Prediction Score       TSR in T+1   Accuracy   Precision (PPV)   Recall (TPR)   F-measure
5 mins (T=96)     Thakur's Metric        58.22%       99.29%     65.58%            44.96%         0.5335
                  Spatial Cosine Sim.    60.87%       99.34%     72.56%            44.17%         0.5491
                  Co-Location Rate       69.35%       99.49%     77.79%            60.71%         0.6820
                  Adamic-Adar Meas.      84.20%       99.72%     84.22%            84.36%         0.8429
                  Jaccard's Coeff.       82.18%       99.68%     83.11%            81.12%         0.8210
                  Katz Measure           90.14%       99.86%     95.29%            89.02%         0.9205
10 mins (T=48)    Thakur's Metric        57.70%       99.27%     65.25%            44.58%         0.5569
                  Spatial Cosine Sim.    60.50%       99.32%     72.56%            43.18%         0.5414
                  Co-Location Rate       68.74%       99.46%     76.50%            60.08%         0.6730
                  Adamic-Adar Meas.      80.04%       99.63%     79.31%            80.87%         0.8008
                  Jaccard's Coeff.       77.97%       99.59%     80.53%            73.77%         0.7700
                  Katz Measure           86.83%       99.78%     86.62%            87.25%         0.8693
30 mins (T=16)    Thakur's Metric        56.73%       99.20%     62.87%            46.14%         0.5322
                  Spatial Cosine Sim.    59.35%       99.26%     72.65%            40.51%         0.5202
                  Co-Location Rate       65.03%       99.39%     80.75%            49.93%         0.6171
                  Adamic-Adar Meas.      67.07%       99.35%     67.77%            64.55%         0.6572
                  Jaccard's Coeff.       66.34%       99.39%     78.56%            51.97%         0.6214
                  Katz Measure           72.85%       99.47%     88.30%            53.86%         0.7279
60 mins (T=8)     Thakur's Metric        55.70%       99.08%     63.51%            41.85%         0.5045
                  Spatial Cosine Sim.    57.57%       99.14%     72.82%            37.22%         0.4926
                  Co-Location Rate       61.71%       99.24%     77.89%            45.10%         0.5712
                  Adamic-Adar Meas.      59.13%       99.14%     57.52%            65.48%         0.6124
                  Jaccard's Coeff.       58.95%       99.22%     71.27%            45.73%         0.5571
                  Katz Measure           61.00%       99.28%     74.90%            50.09%         0.6003
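
The accuracy, precision, recall and F-measure defined in the list above can be computed as in the following Python sketch. The function names and the toy scores are hypothetical. In particular, scanning every observed score as a decision threshold is only one assumed way of locating the maximum accuracy at which precision and recall are then read; the text does not spell out the procedure.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, F-measure) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall, f_measure(precision, recall)


def metrics_at_best_accuracy(scores, labels):
    """Scan every observed score as a threshold (a link is predicted when
    score >= threshold) and return the metrics at maximum accuracy."""
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y)
        tn = len(scores) - tp - fp - fn
        metrics = confusion_metrics(tp, fp, tn, fn)
        if best is None or metrics[0] > best[0]:
            best = metrics
    return best


# Assumed toy data: one score per node pair, label 1 if the link occurs at T+1.
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3]
labels = [1, 0, 0, 1, 0, 1]
print(metrics_at_best_accuracy(scores, labels))

# Consistency check against a reported row (Katz measure, 5 mins, Table IV).
print(round(f_measure(0.9059, 0.9187), 4))
```

Applied to the precision and recall reported for the Katz measure on the Dartmouth Campus trace with 5-minute slices (90.59% and 91.87%), the harmonic mean reproduces the tabulated F-measure of 0.9123.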

The evaluation metrics are computed for all traces with
different tracking period lengths (5, 10, 30 and 60 minutes).
For each trace, we track the network topology from 8 a.m. to
4 p.m. As previously, we divide the history into T periods
and we focus on predicting the links occurring in the period
T+1. The results for the Dartmouth Campus trace are
reported in Table IV. For the MIT Campus trace, the prediction
results are listed in Table V.



The results obtained show that the Katz measure is a
well-founded choice for performing prediction through the
tensor-based technique. In almost all the scenarios reported in
Tables IV and V, this metric achieves better performance than
the other link prediction metrics proposed in the literature.
Hence, among the metrics compared here, the Katz measure is the
most suitable one for link prediction.

The prediction made through the Katz measure achieves
better performance than the mobility homophily metrics
and Thakur et al.'s similarity. Indeed, our framework
quantifies the similarity of nodes based on their encounters
and geographical closeness. In other words, the proposed
prediction method takes into account contacts (or closeness) at
(or around) the same location and at the same time. Meanwhile,
the mobility homophily metrics and Thakur et al.'s similarity
are defined as association metrics. Hence, they measure the
degree of similarity of the behaviors of two mobile nodes without
necessarily checking whether they are in the same location at
the same time. Regarding the comparison with the other network
proximity metrics, the Katz measure better quantifies the behavior
similarity of two nodes as it takes into consideration only the
paths that separate them. Meanwhile, the Adamic-Adar metric
and Jaccard's coefficient depend respectively on the
degree of the common neighbors of two nodes and on the size
of the intersection of their neighbor sets. These latter
metrics express similarity based on the common neighbors of two
nodes but do not check whether a link occurs between them. This
criterion strongly influences the value of the Katz measure and
makes it more precise.
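
For reference, the two neighbor-based metrics discussed above can be written as follows, using their standard definitions on an unweighted, undirected graph. The helper names and the toy edge list are assumptions for illustration; how these metrics were applied to the tracked data in the experiments is not described here.

```python
import math


def neighbors(edges):
    """Build a node -> set of neighbors mapping from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, i, j):
    """Size of the neighbor intersection over the size of the neighbor union."""
    union = adj[i] | adj[j]
    return len(adj[i] & adj[j]) / len(union) if union else 0.0


def adamic_adar(adj, i, j):
    """Sum over common neighbors z of 1 / log(degree(z)), for i != j.

    A common neighbor of two distinct nodes has degree >= 2, so the
    logarithm is never zero.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[i] & adj[j])


# Assumed toy graph: nodes 0 and 1 share neighbors 2 and 3 but are not linked.
adj = neighbors([(0, 2), (1, 2), (0, 3), (1, 3), (2, 4)])
print(jaccard(adj, 0, 1))      # identical neighbor sets
print(adamic_adar(adj, 0, 1))  # 1/ln(3) + 1/ln(2)
```

In this toy graph, nodes 0 and 1 reach the maximum Jaccard value although no link exists between them, which illustrates why these metrics say nothing about direct contacts.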

V. Conclusion 

Human mobility patterns are mostly driven by social inten- 
tions and correlations appear in the behavior of people forming 
the network. These similarities highly govern the mobility of 
people and then directly influence the structure of the network. 
The knowledge about the behavior of nodes greatly helps in 
improving the design of communication protocols. Intuitively, 
two nodes that follow the same social intentions over time 
promote the occurrence of a link in the immediate future. 

In this paper, we presented a link prediction technique
inspired by data mining and exploited it in the context of
human-centered wireless networks. Through the link prediction
evaluation, we have obtained relevant results that attest to
the efficiency of our contribution and agree with some findings
reported in the literature.

Good link prediction offers the possibility to further improve
opportunistic packet forwarding strategies by making better
decisions in order to enhance the delivery rate or limit
latency. Therefore, it will be relevant to supply some routing
protocols with prediction information and to assess the
contribution of our approach in enhancing the performance of
the network, especially as we propose an efficient distributed
version of the prediction method. The proposed technique also
motivates us to investigate future enhancements such as a more
precise tracking of the behavior of nodes and a more efficient
similarity computation.